Peptide-level robust ridge regression improves estimation, sensitivity, and specificity in data-dependent quantitative label-free shotgun proteomics

Abstract

Peptide intensities from mass spectra are increasingly used for relative quantitation of proteins in complex samples. However, numerous issues inherent to the mass spectrometry workflow turn quantitative proteomic data analysis into a crucial challenge. We and others have shown that modeling at the peptide level outperforms classical summarization-based approaches, which typically also discard many proteins at the data preprocessing step. Peptide-based linear regression models, however, still suffer from unbalanced datasets due to missing peptide intensities, from outlying peptide intensities, and from overfitting. Here, we further improve upon peptide-based models by three modular extensions: ridge regression, improved variance estimation by borrowing information across proteins with empirical Bayes, and M-estimation with Huber weights. We illustrate our method on the CPTAC spike-in study and on a study comparing wild-type and ArgP knock-out Francisella tularensis proteomes. We show that the fold change estimates of our robust approach are more precise and more accurate than those from state-of-the-art summarization-based methods and peptide-based regression models, which leads to improved sensitivity and specificity. We also demonstrate that ionization competition effects already come into play at very low spike-in concentrations, and we confirm that analyses with peptide-based regression methods on peptide intensity values aggregated by charge state and modification status (e.g. MaxQuant’s peptides.txt file) are slightly superior to analyses on raw peptide intensity values (e.g. MaxQuant’s evidence.txt file).
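
The three extensions named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`huber_weights`, `robust_ridge`, `shrink_variance`), the fixed penalty `lam`, the tuning constant `k = 1.345`, the MAD-based scale estimate, and the assumption that the first design-matrix column is an unpenalized intercept are all illustrative choices. The prior variance and prior degrees of freedom in `shrink_variance` are taken as given here, whereas an empirical Bayes procedure would estimate them from the variances of all proteins.

```python
import numpy as np


def huber_weights(residuals, scale, k=1.345):
    """Huber weights: 1 for |r| <= k*scale, k*scale/|r| beyond that."""
    abs_r = np.abs(residuals)
    cutoff = k * scale
    weights = np.ones_like(abs_r, dtype=float)
    outlying = abs_r > cutoff
    weights[outlying] = cutoff / abs_r[outlying]
    return weights


def robust_ridge(X, y, lam=1.0, k=1.345, n_iter=50, tol=1e-8):
    """Ridge regression fitted by iteratively reweighted least squares.

    Assumption: column 0 of X is an intercept and is left unpenalized.
    """
    n, p = X.shape
    penalty = lam * np.eye(p)
    penalty[0, 0] = 0.0
    weights = np.ones(n)
    beta = np.zeros(p)
    for _ in range(n_iter):
        # Weighted ridge normal equations: (X'WX + penalty) beta = X'Wy
        XtW = X.T * weights
        beta_new = np.linalg.solve(XtW @ X + penalty, XtW @ y)
        residuals = y - X @ beta_new
        # Robust residual scale from the median absolute deviation (MAD)
        scale = np.median(np.abs(residuals - np.median(residuals))) / 0.6745
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged or scale == 0.0:
            break
        weights = huber_weights(residuals, scale, k)
    return beta, weights


def shrink_variance(sample_var, df, prior_var, prior_df):
    """Shrink per-protein variances toward a common prior variance.

    Degrees-of-freedom-weighted average of prior and sample variance;
    prior_var and prior_df are assumed known in this sketch.
    """
    return (prior_df * prior_var + df * sample_var) / (prior_df + df)
```

In this sketch, an observation with a large residual receives a weight well below 1, so it pulls the coefficient estimates less than it would in an ordinary ridge fit, while the variance shrinkage stabilizes variance estimates for proteins with few observations.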
